Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 41237 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 4.4 MiB |
| Average record size in memory | 112.0 B |
Variable types
| Numeric | 12 |
|---|---|
| Categorical | 2 |
address is highly correlated with name and 2 other fields | High correlation |
name is highly correlated with address | High correlation |
location is highly correlated with address | High correlation |
reviews_list is highly correlated with address and 1 other fields | High correlation |
city is highly correlated with reviews_list | High correlation |
rate is highly correlated with votes | High correlation |
votes is highly correlated with rate | High correlation |
address is highly correlated with name | High correlation |
reviews_list is highly correlated with city | High correlation |
reviews_list is highly correlated with location and 5 other fields | High correlation |
location is highly correlated with reviews_list and 5 other fields | High correlation |
name is highly correlated with reviews_list and 4 other fields | High correlation |
online_order is highly correlated with menu_item | High correlation |
rate is highly correlated with votes and 2 other fields | High correlation |
cost is highly correlated with rate and 2 other fields | High correlation |
cuisines is highly correlated with reviews_list and 3 other fields | High correlation |
rest_type is highly correlated with cost | High correlation |
city is highly correlated with reviews_list and 4 other fields | High correlation |
book_table is highly correlated with rate and 1 other fields | High correlation |
menu_item is highly correlated with reviews_list and 4 other fields | High correlation |
address is highly correlated with reviews_list and 5 other fields | High correlation |
location has 744 (1.8%) zeros | Zeros |
rest_type has 9608 (23.3%) zeros | Zeros |
menu_item has 30317 (73.5%) zeros | Zeros |
type has 847 (2.1%) zeros | Zeros |
city has 727 (1.8%) zeros | Zeros |
Reproduction
| Analysis started | 2021-09-25 06:10:44.470205 |
|---|---|
| Analysis finished | 2021-09-25 06:11:15.209925 |
| Duration | 30.74 seconds |
| Software version | pandas-profiling v3.0.0 |
| Download configuration | config.json |
address
| Distinct | 8792 |
|---|---|
| Distinct (%) | 21.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3602.695007 |
| Minimum | 0 |
| Maximum | 8791 |
| Zeros | 9 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 497 |
| Q1 | 2071 |
| median | 3435 |
| Q3 | 5134 |
| 95-th percentile | 7590 |
| Maximum | 8791 |
| Range | 8791 |
| Interquartile range (IQR) | 3063 |
Descriptive statistics
| Standard deviation | 2194.930805 |
|---|---|
| Coefficient of variation (CV) | 0.6092469111 |
| Kurtosis | -0.6904106917 |
| Mean | 3602.695007 |
| Median Absolute Deviation (MAD) | 1496 |
| Skewness | 0.4448163439 |
| Sum | 148564334 |
| Variance | 4817721.237 |
| Monotonicity | Not monotonic |
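Several rows in the quantile and descriptive statistics tables above follow from one another. The sketch below recomputes them from the reported values using the standard textbook definitions. The variable names are illustrative, and the formulas are assumed rather than confirmed to be what the profiling tool runs internally.

```python
# Values copied from the statistics tables above.
n_observations = 41237
mean = 3602.695007
std = 2194.930805
q1, q3 = 2071, 5134
minimum, maximum = 0, 8791

# Standard definitions (assumed, not taken from the tool's source code).
cv = std / mean                      # coefficient of variation
variance = std ** 2                  # variance is the squared standard deviation
iqr = q3 - q1                        # interquartile range
value_range = maximum - minimum      # range
total = mean * n_observations        # sum of all values

print(f"CV={cv:.6f} variance={variance:.1f} IQR={iqr} "
      f"range={value_range} sum={total:.0f}")
```

Small differences in the last digits are expected because the inputs are already rounded in the report.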
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1002 | 86 | 0.2% |
| 2090 | 61 | 0.1% |
| 2080 | 49 | 0.1% |
| 3962 | 47 | 0.1% |
| 546 | 43 | 0.1% |
| 2115 | 41 | 0.1% |
| 2906 | 39 | 0.1% |
| 2100 | 38 | 0.1% |
| 2118 | 37 | 0.1% |
| 2078 | 36 | 0.1% |
| Other values (8782) | 40760 |
| Value | Count | Frequency (%) |
| 0 | 9 | |
| 1 | 4 | < 0.1% |
| 2 | 11 | |
| 3 | 2 | < 0.1% |
| 4 | 4 | < 0.1% |
| 5 | 7 | |
| 6 | 3 | < 0.1% |
| 7 | 5 | |
| 8 | 7 | |
| 9 | 7 |
| Value | Count | Frequency (%) |
| 8791 | 2 | |
| 8790 | 1 | < 0.1% |
| 8789 | 1 | < 0.1% |
| 8788 | 1 | < 0.1% |
| 8787 | 1 | < 0.1% |
| 8786 | 3 | |
| 8785 | 1 | < 0.1% |
| 8784 | 1 | < 0.1% |
| 8783 | 1 | < 0.1% |
| 8782 | 1 | < 0.1% |
name
| Distinct | 6572 |
|---|---|
| Distinct (%) | 15.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2412.058903 |
| Minimum | 0 |
| Maximum | 6571 |
| Zeros | 11 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 130 |
| Q1 | 1001 |
| median | 2140 |
| Q3 | 3480 |
| 95-th percentile | 5566 |
| Maximum | 6571 |
| Range | 6571 |
| Interquartile range (IQR) | 2479 |
Descriptive statistics
| Standard deviation | 1675.999112 |
|---|---|
| Coefficient of variation (CV) | 0.6948417012 |
| Kurtosis | -0.6322970111 |
| Mean | 2412.058903 |
| Median Absolute Deviation (MAD) | 1250 |
| Skewness | 0.5127225667 |
| Sum | 99466073 |
| Variance | 2808973.023 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 21 | 86 | 0.2% |
| 7 | 85 | 0.2% |
| 37 | 69 | 0.2% |
| 56 | 68 | 0.2% |
| 101 | 68 | 0.2% |
| 132 | 68 | 0.2% |
| 411 | 62 | 0.2% |
| 117 | 60 | 0.1% |
| 48 | 60 | 0.1% |
| 161 | 60 | 0.1% |
| Other values (6562) | 40551 |
| Value | Count | Frequency (%) |
| 0 | 11 | < 0.1% |
| 1 | 4 | < 0.1% |
| 2 | 11 | < 0.1% |
| 3 | 2 | < 0.1% |
| 4 | 4 | < 0.1% |
| 5 | 7 | < 0.1% |
| 6 | 3 | < 0.1% |
| 7 | 85 | |
| 8 | 7 | < 0.1% |
| 9 | 7 | < 0.1% |
| Value | Count | Frequency (%) |
| 6571 | 1 | < 0.1% |
| 6570 | 1 | < 0.1% |
| 6569 | 2 | |
| 6568 | 3 | |
| 6567 | 1 | < 0.1% |
| 6566 | 1 | < 0.1% |
| 6565 | 1 | < 0.1% |
| 6564 | 1 | < 0.1% |
| 6563 | 1 | < 0.1% |
| 6562 | 1 | < 0.1% |
online_order
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 322.3 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 41237 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 27081 | 65.7% |
| 1 | 14156 | 34.3% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 0 | 27081 | 65.7% |
| 1 | 14156 | 34.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 27081 | 65.7% |
| 1 | 14156 | 34.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 41237 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 27081 | 65.7% |
| 1 | 14156 | 34.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 41237 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 27081 | 65.7% |
| 1 | 14156 | 34.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 41237 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 27081 | 65.7% |
| 1 | 14156 | 34.3% |
book_table
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 322.3 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 41237 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 34938 | 84.7% |
| 0 | 6299 | 15.3% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 1 | 34938 | 84.7% |
| 0 | 6299 | 15.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 34938 | 84.7% |
| 0 | 6299 | 15.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 41237 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 34938 | 84.7% |
| 0 | 6299 | 15.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 41237 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 34938 | 84.7% |
| 0 | 6299 | 15.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 41237 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 34938 | 84.7% |
| 0 | 6299 | 15.3% |
rate
| Distinct | 31 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.702029731 |
| Minimum | 1.8 |
| Maximum | 4.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 1.8 |
|---|---|
| 5-th percentile | 2.9 |
| Q1 | 3.4 |
| median | 3.7 |
| Q3 | 4 |
| 95-th percentile | 4.4 |
| Maximum | 4.9 |
| Range | 3.1 |
| Interquartile range (IQR) | 0.6 |
Descriptive statistics
| Standard deviation | 0.4400344597 |
|---|---|
| Coefficient of variation (CV) | 0.118863027 |
| Kurtosis | -0.0001100394791 |
| Mean | 3.702029731 |
| Median Absolute Deviation (MAD) | 0.3 |
| Skewness | -0.3279338391 |
| Sum | 152660.6 |
| Variance | 0.1936303257 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=31)
| Value | Count | Frequency (%) |
| 3.9 | 3954 | 9.6% |
| 3.8 | 3816 | 9.3% |
| 3.7 | 3807 | 9.2% |
| 3.6 | 3286 | 8.0% |
| 4 | 3144 | 7.6% |
| 4.1 | 2925 | 7.1% |
| 3.5 | 2763 | 6.7% |
| 3.4 | 2444 | 5.9% |
| 3.3 | 2272 | 5.5% |
| 4.2 | 2154 | 5.2% |
| Other values (21) | 10672 |
| Value | Count | Frequency (%) |
| 1.8 | 5 | < 0.1% |
| 2 | 11 | < 0.1% |
| 2.1 | 24 | 0.1% |
| 2.2 | 26 | 0.1% |
| 2.3 | 51 | 0.1% |
| 2.4 | 66 | 0.2% |
| 2.5 | 100 | 0.2% |
| 2.6 | 249 | |
| 2.7 | 303 | |
| 2.8 | 580 |
| Value | Count | Frequency (%) |
| 4.9 | 55 | 0.1% |
| 4.8 | 66 | 0.2% |
| 4.7 | 167 | 0.4% |
| 4.6 | 300 | 0.7% |
| 4.5 | 656 | 1.6% |
| 4.4 | 1146 | 2.8% |
| 4.3 | 1682 | |
| 4.2 | 2154 | |
| 4.1 | 2925 | |
| 4 | 3144 |
votes
| Distinct | 2323 |
|---|---|
| Distinct (%) | 5.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 352.7720009 |
| Minimum | 0 |
| Maximum | 16832 |
| Zeros | 19 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 6 |
| Q1 | 21 |
| median | 73 |
| Q3 | 277 |
| 95-th percentile | 1706 |
| Maximum | 16832 |
| Range | 16832 |
| Interquartile range (IQR) | 256 |
Descriptive statistics
| Standard deviation | 884.40923 |
|---|---|
| Coefficient of variation (CV) | 2.507027848 |
| Kurtosis | 73.40401088 |
| Mean | 352.7720009 |
| Median Absolute Deviation (MAD) | 64 |
| Skewness | 6.868364013 |
| Sum | 14547259 |
| Variance | 782179.6862 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 4 | 1123 | 2.7% |
| 6 | 979 | 2.4% |
| 7 | 858 | 2.1% |
| 9 | 735 | 1.8% |
| 11 | 685 | 1.7% |
| 5 | 659 | 1.6% |
| 10 | 617 | 1.5% |
| 8 | 617 | 1.5% |
| 16 | 524 | 1.3% |
| 12 | 463 | 1.1% |
| Other values (2313) | 33977 |
| Value | Count | Frequency (%) |
| 0 | 19 | < 0.1% |
| 1 | 2 | < 0.1% |
| 2 | 10 | < 0.1% |
| 4 | 1123 | |
| 5 | 659 | |
| 6 | 979 | |
| 7 | 858 | |
| 8 | 617 | |
| 9 | 735 | |
| 10 | 617 |
| Value | Count | Frequency (%) |
| 16832 | 3 | |
| 16345 | 3 | |
| 14956 | 2 | |
| 14726 | 1 | < 0.1% |
| 14723 | 3 | |
| 14717 | 2 | |
| 14710 | 3 | |
| 14704 | 1 | < 0.1% |
| 14694 | 1 | < 0.1% |
| 14690 | 1 | < 0.1% |
location
| Distinct | 92 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 29.32807915 |
| Minimum | 0 |
| Maximum | 91 |
| Zeros | 744 |
| Zeros (%) | 1.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 14 |
| median | 24 |
| Q3 | 39 |
| 95-th percentile | 76 |
| Maximum | 91 |
| Range | 91 |
| Interquartile range (IQR) | 25 |
Descriptive statistics
| Standard deviation | 20.30967238 |
|---|---|
| Coefficient of variation (CV) | 0.6924992351 |
| Kurtosis | 0.1818683075 |
| Mean | 29.32807915 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 0.9366931089 |
| Sum | 1209402 |
| Variance | 412.4827922 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 12 | 3873 | 9.4% |
| 18 | 2296 | 5.6% |
| 20 | 1993 | 4.8% |
| 28 | 1800 | 4.4% |
| 8 | 1710 | 4.1% |
| 3 | 1634 | 4.0% |
| 25 | 1568 | 3.8% |
| 24 | 1410 | 3.4% |
| 11 | 1226 | 3.0% |
| 21 | 1055 | 2.6% |
| Other values (82) | 22672 |
| Value | Count | Frequency (%) |
| 0 | 744 | |
| 1 | 595 | 1.4% |
| 2 | 17 | < 0.1% |
| 3 | 1634 | |
| 4 | 158 | 0.4% |
| 5 | 2 | < 0.1% |
| 6 | 62 | 0.2% |
| 7 | 9 | < 0.1% |
| 8 | 1710 | |
| 9 | 89 | 0.2% |
| Value | Count | Frequency (%) |
| 91 | 10 | < 0.1% |
| 90 | 1 | < 0.1% |
| 89 | 1 | < 0.1% |
| 88 | 8 | < 0.1% |
| 87 | 23 | 0.1% |
| 86 | 45 | 0.1% |
| 85 | 4 | < 0.1% |
| 84 | 24 | 0.1% |
| 83 | 3 | < 0.1% |
| 82 | 506 |
rest_type
| Distinct | 87 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.038775857 |
| Minimum | 0 |
| Maximum | 86 |
| Zeros | 9608 |
| Zeros (%) | 23.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 2 |
| Q3 | 9 |
| 95-th percentile | 34 |
| Maximum | 86 |
| Range | 86 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 12.45662471 |
|---|---|
| Coefficient of variation (CV) | 1.549567364 |
| Kurtosis | 6.704845891 |
| Mean | 8.038775857 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 2.451919134 |
| Sum | 331495 |
| Variance | 155.1674992 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 2 | 13871 | |
| 0 | 9608 | |
| 4 | 3368 | 8.2% |
| 9 | 1850 | 4.5% |
| 7 | 1666 | 4.0% |
| 13 | 1278 | 3.1% |
| 28 | 1092 | 2.6% |
| 12 | 704 | 1.7% |
| 17 | 640 | 1.6% |
| 15 | 639 | 1.5% |
| Other values (77) | 6521 |
| Value | Count | Frequency (%) |
| 0 | 9608 | |
| 1 | 173 | 0.4% |
| 2 | 13871 | |
| 3 | 310 | 0.8% |
| 4 | 3368 | 8.2% |
| 5 | 33 | 0.1% |
| 6 | 93 | 0.2% |
| 7 | 1666 | 4.0% |
| 8 | 180 | 0.4% |
| 9 | 1850 | 4.5% |
| Value | Count | Frequency (%) |
| 86 | 6 | < 0.1% |
| 85 | 2 | < 0.1% |
| 84 | 2 | < 0.1% |
| 83 | 16 | |
| 82 | 1 | < 0.1% |
| 81 | 4 | < 0.1% |
| 80 | 4 | < 0.1% |
| 79 | 4 | < 0.1% |
| 78 | 16 | |
| 77 | 5 | < 0.1% |
cuisines
| Distinct | 2367 |
|---|---|
| Distinct (%) | 5.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 503.270946 |
| Minimum | 0 |
| Maximum | 2366 |
| Zeros | 89 |
| Zeros (%) | 0.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 53 |
| median | 253 |
| Q3 | 847 |
| 95-th percentile | 1779 |
| Maximum | 2366 |
| Range | 2366 |
| Interquartile range (IQR) | 794 |
Descriptive statistics
| Standard deviation | 576.9261429 |
|---|---|
| Coefficient of variation (CV) | 1.146352969 |
| Kurtosis | 0.6171637116 |
| Mean | 503.270946 |
| Median Absolute Deviation (MAD) | 220 |
| Skewness | 1.257729512 |
| Sum | 20753384 |
| Variance | 332843.7744 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 5 | 2107 | 5.1% |
| 38 | 1949 | 4.7% |
| 33 | 1231 | 3.0% |
| 10 | 620 | 1.5% |
| 26 | 613 | 1.5% |
| 29 | 600 | 1.5% |
| 71 | 561 | 1.4% |
| 69 | 545 | 1.3% |
| 63 | 513 | 1.2% |
| 53 | 409 | 1.0% |
| Other values (2357) | 32089 |
| Value | Count | Frequency (%) |
| 0 | 89 | 0.2% |
| 1 | 8 | < 0.1% |
| 2 | 11 | < 0.1% |
| 3 | 220 | 0.5% |
| 4 | 8 | < 0.1% |
| 5 | 2107 | |
| 6 | 7 | < 0.1% |
| 7 | 85 | 0.2% |
| 8 | 34 | 0.1% |
| 9 | 7 | < 0.1% |
| Value | Count | Frequency (%) |
| 2366 | 1 | < 0.1% |
| 2365 | 1 | < 0.1% |
| 2364 | 1 | < 0.1% |
| 2363 | 1 | < 0.1% |
| 2362 | 1 | < 0.1% |
| 2361 | 1 | < 0.1% |
| 2360 | 1 | < 0.1% |
| 2359 | 1 | < 0.1% |
| 2358 | 2 | |
| 2357 | 3 |
cost
| Distinct | 63 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 369.5862587 |
| Minimum | 1 |
| Maximum | 950 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1.2 |
| Q1 | 200 |
| median | 400 |
| Q3 | 500 |
| 95-th percentile | 800 |
| Maximum | 950 |
| Range | 949 |
| Interquartile range (IQR) | 300 |
Descriptive statistics
| Standard deviation | 242.522954 |
|---|---|
| Coefficient of variation (CV) | 0.6562012203 |
| Kurtosis | -0.7328152141 |
| Mean | 369.5862587 |
| Median Absolute Deviation (MAD) | 200 |
| Skewness | 0.1526906702 |
| Sum | 15240628.55 |
| Variance | 58817.38319 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 400 | 5261 | |
| 300 | 5242 | |
| 500 | 4080 | 9.9% |
| 600 | 3189 | 7.7% |
| 200 | 3163 | 7.7% |
| 250 | 2124 | 5.2% |
| 800 | 2078 | 5.0% |
| 700 | 1817 | 4.4% |
| 1 | 1515 | 3.7% |
| 350 | 1350 | 3.3% |
| Other values (53) | 11418 |
| Value | Count | Frequency (%) |
| 1 | 1515 | |
| 1.05 | 4 | < 0.1% |
| 1.1 | 490 | 1.2% |
| 1.2 | 968 | |
| 1.25 | 8 | < 0.1% |
| 1.3 | 511 | 1.2% |
| 1.35 | 18 | < 0.1% |
| 1.4 | 464 | 1.1% |
| 1.45 | 5 | < 0.1% |
| 1.5 | 907 |
| Value | Count | Frequency (%) |
| 950 | 60 | 0.1% |
| 900 | 667 | 1.6% |
| 850 | 149 | 0.4% |
| 800 | 2078 | |
| 750 | 741 | 1.8% |
| 700 | 1817 | |
| 650 | 743 | 1.8% |
| 600 | 3189 | |
| 550 | 704 | 1.7% |
| 500 | 4080 |
reviews_list
| Distinct | 21103 |
|---|---|
| Distinct (%) | 51.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8651.516623 |
| Minimum | 0 |
| Maximum | 21102 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 313 |
| Q1 | 3135 |
| median | 7403 |
| Q3 | 14111 |
| 95-th percentile | 19524.2 |
| Maximum | 21102 |
| Range | 21102 |
| Interquartile range (IQR) | 10976 |
Descriptive statistics
| Standard deviation | 6237.739506 |
|---|---|
| Coefficient of variation (CV) | 0.72099954 |
| Kurtosis | -1.151311161 |
| Mean | 8651.516623 |
| Median Absolute Deviation (MAD) | 4898 |
| Skewness | 0.3633891954 |
| Sum | 356762591 |
| Variance | 38909394.15 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 89 | 1111 | 2.7% |
| 2714 | 21 | 0.1% |
| 5634 | 20 | < 0.1% |
| 4608 | 20 | < 0.1% |
| 2721 | 19 | < 0.1% |
| 567 | 19 | < 0.1% |
| 5388 | 19 | < 0.1% |
| 2958 | 18 | < 0.1% |
| 2699 | 18 | < 0.1% |
| 2653 | 18 | < 0.1% |
| Other values (21093) | 39954 |
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 1 | 3 | |
| 2 | 6 | |
| 3 | 1 | < 0.1% |
| 4 | 4 | |
| 5 | 6 | |
| 6 | 3 | |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 5 |
| Value | Count | Frequency (%) |
| 21102 | 1 | |
| 21101 | 1 | |
| 21100 | 1 | |
| 21099 | 1 | |
| 21098 | 1 | |
| 21097 | 1 | |
| 21096 | 1 | |
| 21095 | 1 | |
| 21094 | 1 | |
| 21093 | 1 |
menu_item
| Distinct | 8243 |
|---|---|
| Distinct (%) | 20.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1064.507142 |
| Minimum | 0 |
| Maximum | 8242 |
| Zeros | 30317 |
| Zeros (%) | 73.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 550 |
| 95-th percentile | 6402.2 |
| Maximum | 8242 |
| Range | 8242 |
| Interquartile range (IQR) | 550 |
Descriptive statistics
| Standard deviation | 2125.430372 |
|---|---|
| Coefficient of variation (CV) | 1.996633267 |
| Kurtosis | 2.348151846 |
| Mean | 1064.507142 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.908787455 |
| Sum | 43897081 |
| Variance | 4517454.268 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 30317 | |
| 1801 | 11 | < 0.1% |
| 1659 | 9 | < 0.1% |
| 1796 | 9 | < 0.1% |
| 4210 | 8 | < 0.1% |
| 4571 | 8 | < 0.1% |
| 641 | 8 | < 0.1% |
| 4561 | 8 | < 0.1% |
| 1941 | 8 | < 0.1% |
| 821 | 8 | < 0.1% |
| Other values (8233) | 10843 | 26.3% |
| Value | Count | Frequency (%) |
| 0 | 30317 | |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 8242 | 1 | |
| 8241 | 1 | |
| 8240 | 1 | |
| 8239 | 1 | |
| 8238 | 2 | |
| 8237 | 1 | |
| 8236 | 1 | |
| 8235 | 1 | |
| 8234 | 1 | |
| 8233 | 1 |
type
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.80730897 |
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 847 |
| Zeros (%) | 2.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 4 |
| 95-th percentile | 4 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.17050684 |
|---|---|
| Coefficient of variation (CV) | 0.4169497736 |
| Kurtosis | -0.4717621162 |
| Mean | 2.80730897 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.2578737323 |
| Sum | 115765 |
| Variance | 1.370086261 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=7)
| Value | Count | Frequency (%) |
| 2 | 20431 | |
| 4 | 14062 | |
| 3 | 2709 | 6.6% |
| 1 | 1511 | 3.7% |
| 5 | 1045 | 2.5% |
| 0 | 847 | 2.1% |
| 6 | 632 | 1.5% |
| Value | Count | Frequency (%) |
| 0 | 847 | 2.1% |
| 1 | 1511 | 3.7% |
| 2 | 20431 | |
| 3 | 2709 | 6.6% |
| 4 | 14062 | |
| 5 | 1045 | 2.5% |
| 6 | 632 | 1.5% |
| Value | Count | Frequency (%) |
| 6 | 632 | 1.5% |
| 5 | 1045 | 2.5% |
| 4 | 14062 | |
| 3 | 2709 | 6.6% |
| 2 | 20431 | |
| 1 | 1511 | 3.7% |
| 0 | 847 | 2.1% |
city
| Distinct | 30 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.48478308 |
| Minimum | 0 |
| Maximum | 29 |
| Zeros | 727 |
| Zeros (%) | 1.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 322.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 7 |
| median | 15 |
| Q3 | 20 |
| 95-th percentile | 28 |
| Maximum | 29 |
| Range | 29 |
| Interquartile range (IQR) | 13 |
Descriptive statistics
| Standard deviation | 7.990329817 |
|---|---|
| Coefficient of variation (CV) | 0.5516361392 |
| Kurtosis | -1.011899506 |
| Mean | 14.48478308 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 0.007137381607 |
| Sum | 597309 |
| Variance | 63.84537058 |
| Monotonicity | Increasing |
Histogram with fixed size bins (bins=30)
| Value | Count | Frequency (%) |
| 6 | 2580 | 6.3% |
| 19 | 2361 | 5.7% |
| 16 | 2254 | 5.5% |
| 17 | 2250 | 5.5% |
| 18 | 2121 | 5.1% |
| 12 | 1915 | 4.6% |
| 13 | 1633 | 4.0% |
| 11 | 1537 | 3.7% |
| 7 | 1512 | 3.7% |
| 23 | 1510 | 3.7% |
| Other values (20) | 21564 |
| Value | Count | Frequency (%) |
| 0 | 727 | 1.8% |
| 1 | 1208 | |
| 2 | 1072 | |
| 3 | 956 | 2.3% |
| 4 | 1483 | |
| 5 | 1139 | |
| 6 | 2580 | |
| 7 | 1512 | |
| 8 | 818 | 2.0% |
| 9 | 953 | 2.3% |
| Value | Count | Frequency (%) |
| 29 | 1201 | |
| 28 | 1018 | |
| 27 | 1345 | |
| 26 | 872 | |
| 25 | 1173 | |
| 24 | 569 | 1.4% |
| 23 | 1510 | |
| 22 | 1293 | |
| 21 | 946 | |
| 20 | 1449 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the slope of the line (its angle to the x-axis) does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
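The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the profiling tool's own code. The function name `pearson_r` is hypothetical, and the sample values are the rate and votes columns of the ten first rows shown later in this report.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("r is undefined for a constant sequence")
    return cov / (dev_x * dev_y)

# rate and votes from the first ten rows of the dataset
rate = [4.1, 4.1, 3.8, 3.7, 3.8, 3.8, 3.6, 4.6, 4.0, 4.2]
votes = [775, 787, 918, 88, 166, 286, 8, 2556, 324, 504]

r = pearson_r(rate, votes)
# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([2 * x + 3 for x in rate], [0.5 * v - 10 for v in votes])
print(r, r_rescaled)
```

Ten rows are far too few to reproduce the report's correlation values. The point is only to show the formula and its invariance under location and scale changes.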
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
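As a rough sketch of that definition, the code below ranks each variable and then applies the Pearson formula to the ranks. The helper names `average_ranks` and `spearman_rho` are hypothetical. Giving tied values the average of their ranks is an assumed convention; it is a common one, but the report does not say which convention the tool uses.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (dev_x * dev_y)

# A monotonic but nonlinear relationship: the ranks agree exactly.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
print(rho)
```

Because only the ordering matters, the cubic relationship in the example is treated as a perfect monotonic association.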
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
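The pair-counting definition above translates directly into code. The sketch below implements exactly that formula, often called tau-a, under the hypothetical name `kendall_tau_a`. Libraries commonly apply a tie correction (tau-b) instead, so results on data with many ties may differ from the values in this report.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts as neither
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

This version compares every pair and so runs in quadratic time, which is fine for illustration but slow for a dataset of this size.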
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
First rows
| | address | name | online_order | book_table | rate | votes | location | rest_type | cuisines | cost | reviews_list | menu_item | type | city |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 4.1 | 775 | 0 | 0 | 0 | 800.0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 1 | 0 | 1 | 4.1 | 787 | 0 | 0 | 1 | 800.0 | 1 | 0 | 0 | 0 |
| 2 | 2 | 2 | 0 | 1 | 3.8 | 918 | 0 | 1 | 2 | 800.0 | 2 | 0 | 0 | 0 |
| 3 | 3 | 3 | 1 | 1 | 3.7 | 88 | 0 | 2 | 3 | 300.0 | 3 | 0 | 0 | 0 |
| 4 | 4 | 4 | 1 | 1 | 3.8 | 166 | 1 | 0 | 4 | 600.0 | 4 | 0 | 0 | 0 |
| 5 | 5 | 5 | 0 | 1 | 3.8 | 286 | 1 | 0 | 5 | 600.0 | 5 | 0 | 0 | 0 |
| 6 | 6 | 6 | 1 | 1 | 3.6 | 8 | 2 | 0 | 6 | 800.0 | 6 | 0 | 0 | 0 |
| 7 | 7 | 7 | 0 | 0 | 4.6 | 2556 | 0 | 3 | 7 | 600.0 | 7 | 0 | 1 | 0 |
| 8 | 8 | 8 | 0 | 1 | 4.0 | 324 | 0 | 4 | 8 | 700.0 | 8 | 0 | 1 | 0 |
| 9 | 9 | 9 | 0 | 1 | 4.2 | 504 | 0 | 4 | 9 | 550.0 | 9 | 0 | 1 | 0 |
Last rows
| | address | name | online_order | book_table | rate | votes | location | rest_type | cuisines | cost | reviews_list | menu_item | type | city |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 41227 | 8702 | 2372 | 1 | 0 | 4.0 | 189 | 25 | 42 | 1060 | 1.5 | 20919 | 0 | 6 | 29 |
| 41228 | 3315 | 2833 | 0 | 0 | 3.8 | 128 | 25 | 33 | 1211 | 1.2 | 4215 | 0 | 6 | 29 |
| 41229 | 8735 | 6527 | 1 | 1 | 3.7 | 27 | 25 | 11 | 204 | 1.2 | 20939 | 0 | 6 | 29 |
| 41230 | 2859 | 2509 | 1 | 1 | 3.9 | 77 | 25 | 68 | 237 | 2.0 | 3727 | 0 | 6 | 29 |
| 41231 | 2870 | 2512 | 1 | 1 | 2.8 | 161 | 25 | 28 | 1102 | 1.2 | 3730 | 0 | 6 | 29 |
| 41232 | 3137 | 2699 | 1 | 1 | 3.7 | 34 | 25 | 28 | 204 | 800.0 | 4028 | 0 | 6 | 29 |
| 41233 | 8791 | 1716 | 1 | 1 | 2.5 | 81 | 25 | 28 | 761 | 800.0 | 21082 | 0 | 6 | 29 |
| 41234 | 8725 | 6532 | 1 | 1 | 3.6 | 27 | 25 | 17 | 240 | 1.5 | 20956 | 0 | 6 | 29 |
| 41235 | 8786 | 6568 | 1 | 0 | 4.3 | 236 | 56 | 17 | 237 | 2.5 | 21054 | 0 | 6 | 29 |
| 41236 | 3444 | 6569 | 1 | 1 | 3.4 | 13 | 56 | 33 | 1870 | 1.5 | 21055 | 0 | 6 | 29 |